# High Accuracy Transcription
Parakeet Tdt Ctc 0.6b Ja
This model is a Japanese automatic speech recognition (ASR) model based on the FastConformer architecture, developed by NVIDIA and converted to MLX format.
Speech Recognition
mlx-community
368
1
Stt Ru Fastconformer Hybrid Large Pc Onnx
NVIDIA FastConformer-Hybrid Large is a Russian automatic speech recognition model based on the FastConformer architecture, supporting CTC and RNN-T decoders.
Speech Recognition
istupakov
163
1
Whisper Custom Small
Apache-2.0
A small speech recognition model based on the OpenAI Whisper architecture, focused on English speech-to-text tasks.
Speech Recognition, English
gyrroa
15
1
Whisper Large V3 Turbo Shqip
MIT
An Albanian-optimized speech recognition model based on OpenAI Whisper Large v3 Turbo, supporting standard Albanian and the Gheg dialect.
Speech Recognition
Transformers, Other

Kushtrim
143
4
Voice Clone Large Finetune Final
Apache-2.0
Named as a voice cloning model, this is a fine-tune of openai/whisper-large-v3 that is primarily used for speech recognition. It reports a word error rate of 15.3572 on the evaluation set.
Speech Recognition
Transformers

neuronbit
37
2
Whisper Large V3 Gguf
Apache-2.0
Whisper is a multilingual automatic speech recognition (ASR) system that supports speech-to-text tasks in multiple languages.
Speech Recognition, Supports Multiple Languages
vonjack
931
14
Belle Whisper Large V3 Zh
Apache-2.0
A Chinese speech recognition model fine-tuned from whisper-large-v3, showing significant performance improvements on multiple Chinese speech benchmarks.
Speech Recognition
Transformers

BELLE-2
1,666
112
Stt Fa Fastconformer Hybrid Large
A hybrid model for Persian automatic speech recognition (ASR), built on the FastConformer architecture and trained with combined transducer and CTC decoder losses.
Speech Recognition, Other
nvidia
2,398
9
Whisper Large V3 German
Apache-2.0
A German speech recognition model fine-tuned from Whisper Large v3 and optimized for German speech processing.
Speech Recognition
Transformers, German

primeline
8,745
70
Wav2vec2 Base 960h
An ONNX conversion of Facebook's wav2vec2-base-960h model, designed for Transformers.js and supporting in-browser speech recognition.
Speech Recognition
Transformers

Xenova
117
3
Wav2vec2 Large Xlsr 53 English
A large-scale speech recognition model based on the wav2vec 2.0 architecture, supporting English speech-to-text conversion.
Speech Recognition
Transformers

Xenova
14
2
Faster Whisper Large V2 Mix Jp
This is the CTranslate2-converted version of the whisper-large-v2-mix-jp model, suitable for Japanese speech recognition tasks.
Speech Recognition, Japanese
arc-r
64
9
ASCEND Dataset Model
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the ASCEND dataset.
Speech Recognition
Transformers

GleamEyeBeast
22
0
Wav2vec2 Base Libir Zenodo
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-base-960h on an unknown dataset, primarily used for automatic speech recognition.
Speech Recognition
Transformers

samantharhay
25
0
Wav2vec2 Gujarati Stt
This is a Gujarati speech recognition model based on the Wav2Vec2 architecture, capable of directly converting Gujarati speech into text.
Speech Recognition
Transformers

addy88
18
0
Wav2vec2 Gpt2 Wandb Grid Search
An automatic speech recognition (ASR) model trained on the LibriSpeech dataset.
Speech Recognition
Transformers

sanchit-gandhi
13
0
Wav2vec2 Kannada Stt
A Kannada speech recognition model based on the Wav2Vec2 architecture, capable of directly converting Kannada speech into text.
Speech Recognition
Transformers

addy88
96
1
Wav2vec2 Urdu Stt
This is an Urdu speech recognition model based on the Wav2Vec2 architecture, capable of converting Urdu speech into text.
Speech Recognition
Transformers

addy88
145
0
Wsj0 Full Supervised
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-large-lv60 on the WSJ0 dataset, achieving a word error rate of 0.0343 on the evaluation set.
Speech Recognition
Transformers

Kuray107
26
0
Featured Recommended AI Models